Automated generalisation in production at Kadaster NL



Vincent van Altena 1, Ron Nijhuis 1, Marc Post 1, Ben Bruns 1, Jantien
Stoter 1,2

1 Kadaster, The Netherlands, email: firstname.secondname@kadaster.nl
2 OTB, Delft University of Technology, The Netherlands,
j.e.stoter@tudelft.nl



Abstract. This paper presents the implementation of a fully automated
production workflow to generalise a 1:50k map from 1:10k data. The
feasibility study for this workflow started in 2010 and has led to the
production of a countrywide 1:50k map in 2013. From that moment on, the
automatically generalised 1:50k map will replace the existing one. Because
of the limited time needed to generalise the 1:50k map series from 1:10k
data, a new 1:50k map update is foreseen with every new release of 1:10k
data, i.e. five times a year.

Keywords: automated generalisation, cartography, multiscale topographic
data



1. Introduction 

In 2010 the Netherlands' Kadaster, which also houses the national mapping
agency, started a feasibility study to introduce automated generalisation
in its map production line. The study was motivated by the problems
Kadaster encountered in meeting its legal obligations within available
budgets. Kadaster is legally obliged to produce topographic vector data and
raster maps at scales 1:10k, 1:50k, 1:100k, 1:250k, 1:500k and 1:1000k in
an update cycle of two years (or shorter). To meet this obligation Kadaster
has converted its vectorised maps into object-oriented databases since
2007. The interactive update (i.e. generalisation) of these products by
cartographers takes too much time (and consequently costs too much) to meet
the obliged update cycle within available budgets. For small-scale maps
this problem is even bigger, since maps are not generalised directly from
the 1:10k source at every scale, but in steps from the next larger scale
map in a ladder approach. Therefore small-scale maps take even longer to be
updated. The time-consuming update process also limits the possibility to
produce on-demand products (i.e. different products for different demands).

The feasibility study to address these problems has led to a fully
automated workflow to produce 1:50k maps, which will be used in production
from 2013. The following sections describe the Kadaster approach to
introducing automated generalisation in production. Section 2 describes the
generic characteristics of the automated generalisation approach, Section 3
describes implementation details and Section 4 closes with results,
findings and future plans.



2. The Kadaster automated generalisation approach 

The feasibility study on automated generalisation first focused on the
workflow from object-oriented 1:10k data (called TOP10NL) to a 1:50k map.
Both source and target data cover the complete skin of the earth (no gaps
or overlaps). The source data is shown in Figure 1. Now that the
feasibility study has been developed into a production line, the
generalisation of a 1:100k map from 1:10k data is under development.




Figure 1: Source data (1:10k), displayed at smaller scale



From the beginning it was clear that the aim of the automated
generalisation workflow should not be to replicate the existing map. There
were several reasons for this.

Firstly, legacy topographic products, which often originate from more than
60 years ago, may overemphasise past (cartographic) requirements and may
ignore new requirements for multiscale topographic information. For
example, topographic information is used in many more applications and by a
wider public than ever before, and users may prefer up-to-date maps over
maps that meet all traditional cartographic principles (although the
results should still be of acceptable quality). In addition, automating a
previously interactive process, which was designed for a past technical and
organisational context, has proved to be very complicated (Foerster et al.,
2010; Stoter et al., 2009b). Another reason to reconsider existing map
requirements is that the time saved by automation makes the process well
suited to producing various products for different demands. Therefore the
requirements for automatically generated multi-products may differ from the
requirements for the existing single product that should fit all uses.

A few other aspects further refined the scope of the Kadaster approach:

• The focus is on producing a map. Therefore disruptions of the geometry
to meet cartographic requirements do not have to be controlled, apart
from assuring their consistency in the resulting map, i.e. roads and
water should still form complete networks after generalisation. To
accomplish this, the workflow performs generalisation using a 'smart'
partitioning, instead of the existing approach of generalising within
map sheets (see Section 3.4).

• Generalisation without any interaction is the best guarantee for
efficiency and consistency and the only way to produce multiple
on-demand products. Therefore we do not allow the automatically achieved
results to be interactively improved afterwards.

• The most straightforward way to handle updates in a 100% automated
generalisation workflow is to completely replace the old version of the
map. This is in line with Regnauld (2011). Therefore, at this moment, we
do not maintain links between the objects at the different scale levels.
If these links are required, this will be part of a subsequent study.

Since we reconsider existing map specifications based on the potential of
current technologies, one of the main challenges was how to redefine
specifications for automated generalisation, taking existing guidelines as
a starting point while assuring that users' requirements are met. We have
accomplished this in the following way.

Firstly, we generated an initial 1:50k map in a semi-automatic manner by
extending the work of Stoter et al. (2009a; 2012). The aim of this first
step was to see how much automation we could achieve with currently
available tooling and some self-developed algorithms. This work implemented
existing guidelines for interactive generalisation in an automated process
and improved the implementation by evaluating intermediate results.



The initial map was sent to a selection of main customers of the current
1:50k map, who are formally organised in a users group, to test the main
principles and assumptions. Based on the insights obtained, the process was
improved, refined and implemented as one integrated workflow. In a next
stage the evaluation and improvement process was repeated by asking more
customers to assess the resulting map in more detail and for different
types of areas, i.e. 'Relief and dense road pattern'; 'Complicated
crossings and dense parcel boundaries'; 'Dense water network' and 'Urban
and industrial area'.

Based on those evaluations and iterative testing, the optimal sequence of
steps was determined, as well as the most appropriate algorithms and
parameter values for each step, integrated in one workflow.

Interestingly, the evaluations showed that the customers valued the "same
appearance of the map" less than "more frequent update cycles". Indeed,
they confirmed that they were very pleased that updated 1:50k maps will
appear two years earlier than is currently the case, and that the 1:50k
maps will be 100% consistent with the 1:10k source data because of the
synchronised releases. Another interesting observation was that some
results of the automated generalisation were rated higher than the results
of interactive generalisation. For example, the automatically thinned road
network appeared to be more appropriate for navigation than the
interactively thinned road network. Finally, several respondents were
satisfied with the improved uniformity of the whole map.



3. Implementation details 

This section describes the software used (Section 3.1), the pre-processing
of the data (Section 3.2), the implemented automated generalisation
workflow (Section 3.3) and finally the approach that was applied to
automatically generalise the whole country (Section 3.4).

3.1. Used software and technology 

For the implementation we use a mixture of standard ArcGIS tools,
self-developed tools within Python and a series of FME tools. ArcGIS
contains some specialised generalisation tools for collapsing two lanes of
a road into a single road line, displacing symbolised geometries,
simplifying (symbolised) buildings and thinning networks, see Punt and
Watkins (2010). The complete generalisation workflow is implemented within
the ModelBuilder tool of ArcGIS. The workflow consists of three main
models, which together comprise about 200 sub-models, each responsible for
a specific generalisation problem that we need to solve in the process.



3.2. Pre-processing the data 

Since the aim is 100% automation, the process should cover as many
generalisation aspects as possible. This is accomplished either by
improving the process step by step or, if that did not work, by improving
and enriching the source data. Besides correcting (hidden) errors
(resulting in TOP10NL basis), the enrichment of the source data (resulting
in TOP10Extra) is done in two ways: either external data sources are used,
or the required knowledge is made explicit by computation. Examples of
enrichments of the input data are determining urban extents by defining
areas with a high density of buildings (i.e. higher than 10%) and
attributing TOP10NL road segments with information on exits, to better
control the process that generalises the road network from the TOP10NL
roads.
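
The urban-extent enrichment can be illustrated with a minimal sketch. It
assumes that building footprint area has already been summed per cell of a
regular grid; the grid, the cell size and all function names are
hypothetical and not taken from the Kadaster workflow. Only the 10%
threshold comes from the text.

```python
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]  # (column, row) index of a grid cell; hypothetical layout

DENSITY_THRESHOLD = 0.10  # "higher than 10%", as stated in the text


def building_density(building_area_m2: float, cell_area_m2: float) -> float:
    """Fraction of a cell's area that is covered by building footprints."""
    if cell_area_m2 <= 0:
        raise ValueError("cell area must be positive")
    return building_area_m2 / cell_area_m2


def urban_cells(building_area_per_cell: Dict[Cell, float],
                cell_area_m2: float,
                threshold: float = DENSITY_THRESHOLD) -> Set[Cell]:
    """Return the cells whose building density is strictly above the threshold."""
    return {
        cell
        for cell, area in building_area_per_cell.items()
        if building_density(area, cell_area_m2) > threshold
    }


# Assumed example values: 100 m x 100 m cells (10 000 m2 each).
areas = {(0, 0): 2500.0, (0, 1): 1000.0, (1, 0): 400.0}
urban = urban_cells(areas, cell_area_m2=10_000.0)
```

In this example only cell (0, 0), with 25% coverage, is flagged; a cell at
exactly 10% is not, because the text specifies "higher than". In practice
adjacent flagged cells would still have to be merged into urban extent
polygons, which this sketch leaves out.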



3.3. Implemented workflow 

The implemented automated generalisation workflow consists of the following
steps:

(1) Model generalisation, aiming at reducing the data that has to be
visualised. This is the largest part of the process. Model generalisation
is not only the conversion of geometric objects to the lower density and
structure of the TOP50 model, but also the translation and
reclassification of attributes to the TOP50 model. The main operations
are replacing road polygons with road centrelines; merging individual
road lanes into a single line geometry; pruning the road and water
networks; and generalisation of small land use areas.

(2) Symbolisation of the data. The symbolisation process assigns symbols to
all geometries, as they should appear on the map. In this process basic
symbols are used which exactly correspond to the shape and outline of the
portrayed features, but which lack all cartographic refinement.
Sophisticated symbolisation is postponed to a later stage in the process.
The symbolisation may result in objects that appear larger on the map
than they are in reality, and in overlapping objects. This is solved in
the next step.

(3) Graphic generalisation to solve cartographic conflicts between
symbolised objects. The graphic generalisation process consists of
simplifying, typifying and displacing buildings, and displacing the linear
objects (roads, water) and the boundaries of symbolised water and terrain
objects, as well as all other point and linear objects (i.e.
administrative boundaries, height contours, engineering constructs). At
the end of the process polygon objects are rebuilt from the displaced
boundaries and the former codes are assigned to the new areas by using
left/right information of the boundaries.



3.4. Countrywide coverage 

To be able to generalise a map for the whole of the Netherlands, the
workflow is applied to about 400 generated partitions. These partitions
were generated using linear objects that must never be displaced, namely
highways and main roads. In contrast to map sheet boundaries, these
boundaries also exist in the real world and they hardly clip any objects.
Near the coast, where these linear road objects are missing, artificial
partitions have been made. Apart from some global operations that are
applied to the whole country (such as creating and simplifying the power
line network), the workflow is applied per partition and the partitions are
connected afterwards. Because vertices of objects at and near partition
boundaries are not allowed to move in the displacement process, the objects
in neighbouring partitions still fit together after generalisation.
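
The rule that vertices at and near partition boundaries may not move can be
sketched as follows. This is only an illustration under stated assumptions:
the boundary is given as plain line segments, the displacement vectors are
supplied from outside, and the lock distance and all names are
hypothetical. In the actual workflow displacement is carried out by the
ArcGIS tools mentioned in Section 3.1.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def distance_to_segment(p: Point, seg: Segment) -> float:
    """Euclidean distance from point p to a line segment."""
    (ax, ay), (bx, by) = seg
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:  # degenerate segment: both end points coincide
        return math.hypot(p[0] - ax, p[1] - ay)
    # Project p onto the segment's line and clamp to the end points.
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (ax + t * dx), p[1] - (ay + t * dy))


def displace(vertices: List[Point], shifts: List[Point],
             boundary: List[Segment], lock_distance: float) -> List[Point]:
    """Apply each shift unless the vertex lies within lock_distance of the boundary."""
    result = []
    for (x, y), (sx, sy) in zip(vertices, shifts):
        locked = any(distance_to_segment((x, y), seg) <= lock_distance
                     for seg in boundary)
        result.append((x, y) if locked else (x + sx, y + sy))
    return result


# Assumed example: a partition boundary along x = 0 and a 10 m lock zone.
boundary = [((0.0, 0.0), (0.0, 100.0))]
vertices = [(0.0, 50.0), (3.0, 50.0), (40.0, 50.0)]
shifts = [(5.0, 0.0)] * 3
moved = displace(vertices, shifts, boundary, lock_distance=10.0)
```

Here the first two vertices stay in place because they lie on or within
10 m of the boundary, while the third is shifted. Since both neighbouring
partitions leave the same boundary vertices untouched, their edges still
match when the partitions are joined.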

The generalisation of the 1:50k map from 1:10k source data for the whole
country can be achieved in 50 hours. This is realised through the
multiprocessing capabilities of Python, which allow parallel processing of
six partitions on each of the six available systems.
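
A back-of-the-envelope calculation puts these figures in relation to each
other. The sketch below uses only the numbers stated in the text (about 400
partitions, six systems, six processes each, 50 hours). The round-robin
assignment of partitions to systems is an assumption made for illustration;
the text does not describe how partitions are actually scheduled.

```python
import math
from typing import List

PARTITIONS = 400            # "about 400" in the text
SYSTEMS = 6
PROCESSES_PER_SYSTEM = 6
WALL_CLOCK_HOURS = 50.0

workers = SYSTEMS * PROCESSES_PER_SYSTEM    # 36 parallel processes
rounds = math.ceil(PARTITIONS / workers)    # partitions each worker handles at most

# Upper bound on the average processing time per partition: it assumes that
# all workers are busy for the full 50 hours and ignores global operations.
worker_hours = workers * WALL_CLOCK_HOURS
avg_hours_per_partition = worker_hours / PARTITIONS


def assign_round_robin(partition_ids: List[int], n_systems: int) -> List[List[int]]:
    """Distribute partition ids over systems in turn (hypothetical scheduling)."""
    batches: List[List[int]] = [[] for _ in range(n_systems)]
    for i, pid in enumerate(partition_ids):
        batches[i % n_systems].append(pid)
    return batches


batches = assign_round_robin(list(range(PARTITIONS)), SYSTEMS)
```

Under these assumptions each process handles at most 12 partitions and a
partition takes no more than 4.5 hours on average. Partitions differ in
size and content, so actual run times per partition will vary.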



4. Results, findings and future plans 

Figure 4a shows the 1:50k map that is generalised fully automatically with
the resulting workflow from the data shown in Figure 1. Figure 4b shows the
interactively generalised version for a global comparison.




Figure 4b: 1:50k map, interactively generalised; displayed at smaller scale

Based on the results and the very good user evaluations, Kadaster decided
that a fully automated generalisation workflow is the most sustainable
workflow for the future, as well as the only way to produce products on
demand.

As mentioned above, the thirty-six parallel Python processes perform the
core generalisation process for the complete Netherlands within
approximately 50 hours. Including pre-processing, generalisation,
visualisation and printing, the whole turnaround is 3 weeks (at most) for
the whole country. Therefore a 1:50k update is foreseen with every new
delivery of TOP10NL (five times a year).

Based on the experiences with the new 1:50k product, the automated
generalisation approach is currently being extended to the 1:100k map and
to on-demand products, such as the backdrop map at multiple (between 7 and
16) scales for the national geo-portal.



5. References 

Foerster, T., J.E. Stoter and M. Kraak (2010). Challenges for Automated
Generalisation at European Mapping Agencies: A Qualitative and Quantitative
Analysis, In: The Cartographic Journal, Volume 47, 1, pp. 41-54



Punt, E. and D. Watkins (2010). User-directed generalization of roads and
buildings for multi-scale topography, 13th ICA Workshop on Generalisation
and Multiple Representation, 2010 Zurich



Regnauld, N. (2011). OS Vectormap district: automated generalisation, text
placement and conflation in support of making public data public, presented
at the 25th International Cartographic Conference, July, 2011 Paris.
Available online:
http://icaci.org/files/documents/ICC_proceedings/ICC2011/Oral%20Presentations%20PDF/D3-Generalisation/CO-358.pdf



Stoter, J E, Smaalen, J van, Nijhuis, R, Dortland, A, Bulder J & Bruns, B
(2012). Fully automated generalisation of topographic data in current
geo-information environments. In S Zlatanova, H Ledoux, EM Fendel & M Rumor
(Eds.), Urban and Regional Data Management - UDMS Annual 2011
(pp. 109-120). Leiden: CRC Press - Taylor & Francis Group.



Stoter, J.E., J. van Smaalen, N. Bakker, P. Hardy (2009a). Specifying map
requirements for automated generalisation of topographic data, The
Cartographic Journal Vol. 46 No. 3 pp. 214-227 August 2009



Stoter, J.E., D. Burghardt, C. Duchene, B. Baella, N. Bakker, C. Blok,
M. Pla, N. Regnauld, G. Touya, S. Schmid (2009b). Methodology for
evaluating automated map generalization in commercial software, Pages
311-324 In: Computers, Environment and Urban Systems Volume 33, Issue 5,
September 2009.



